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MULTILEVEL ENSEMBLE TRANSFORM PARTICLE FILTERING 

A. GREGORY ^ C. J. COTTER AND S. REICH * 

Abstract. This paper extends the Multilevel Monte Carlo variance reduction technique to nonlinear 
filtering. In particular, Multilevel Monte Carlo is applied to a certain variant of the particle filter, the 
Ensemble Transform Particle Eilter. A key aspect is the use of optimal transport methods to re-establish 
correlation between coarse and fine ensembles after resampling; this controls the variance of the es¬ 
timator. Numerical examples present a proof of concept of the effectiveness of the proposed method, 
demonstrating significant computational cost reductions (relative to the single-level ETPF counterpart) 
in the propagation of ensembles. 
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1. Introduction. Data assimilation is the process of incorporating observed data 
into model forecasts. In data assimilation, one is interested in computing statistics 

[X] of solutions X to random dynamical systems with respect to a posterior mea¬ 
sure (rj) given partial observations of the system. In particle filtering [7, 4], this is 
done by using an empirical ensemble representing the posterior distribution rj at any 
one time. The propagation in time of the members (particles) of this ensemble can be 
computationally expensive, especially in high dimensional systems. 

Recently, the Multilevel Monte Carlo (MLMC) method has been developed for 
achieving significant cost reductions in Monte Carlo simulations [9]. It has been ap¬ 
plied to areas such as Markov Chain Monte Carlo [13] and quasi-Monte Carlo [10] to 
return computational cost reductions from exisiting techniques. It has also been ap¬ 
plied to uncertainty quantification within PDEs [6]. The idea is to consider a hierar¬ 
chy of discretized models, balancing numerical error in cheap/coarse models against 
Monte Carlo variance in expensive/fine models. It is desirable to adapt MLMC to se¬ 
quential Monte Carlo methods such as particle filters, and some first steps have been 
taken in this direction. Firstly, [11] have developed a multilevel Ensemble Kalman 
Filter (EnKF), using MEMC estimators to calculate the mean and covariance of the 
posterior, in the case where the underlying distributions are Gaussian and the model is 
linear. However for non-Gaussian distributions and nonlinear models, the EnKF is bi¬ 
ased. The method does however converge to a “mean-field limif” [14]. Secondly, [3] 
proposed a mulfilevel sequenfial Monfe Carlo mefhod for Bayesian inference prob¬ 
lems fo give significanf compufafional cosl reductions from sfandard fechniques. Our 
goal in fhis paper is fo fake a sfep furfher by applying MEMC fo nonlinear tillering 
problems. 
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In general, MLMC works by computing statistics from pairs of coarser and finer 
correlated ensembles. For Monte Carlo simulation of SDEs, this correlation is achieved 
by using the same initial conditions and Brownian paths for each coarse/fine pair of 
ensemble members. The key challenge in applying MLMC to particle filtering is in 
maintaining this correlation after resampling. [8] suggested that correlating coarse 
and fine ensembles could be achieved by minimising fhe Wassersfein disfance be- 
fween fhe fwo ensembles. This can formulaled as a optimal fransporfafion problem 
[20]. 

In fhis paper, we adapf fhe MLMC framework fo fhe Ensemble Transform Particle 
Lilfer (ETPL) [19]. ETPL is an efficienf and effecfive nonlinear filler fhal uses opfimal 
fransporfafion Iransformalions [22] insfead of random resampling. In our Multilevel 
ETPL (MLETPE) fhe coupling belween coarse and fine ensembles is also mainlained 
using opfimal fransporfafion. The sole aim of infroducing MLMC fo fhe ETPL is lo 
reduce fhe compufalional cosl of fhe propagation of particles. This is only a benefil 
if fhe compufalional cosl dominates fhe opfimal fransporfafion Iransformalion cosl; 
whilsl direcl solvers for opfimal fransporfafion problems wilh one-dimensional sfale 
space scale as 0(^NIog{N)^, solvers for problems wilh more lhan one dimension scale 
as 0[N^log{N)) wilh fhe ensemble size. To address fhis, a technique commonly used 
in fhe aforemenfioned EnKP known as localisalion can be used fo reduce fhis opfimal 
fransporfafion cosl significanlly [5]. Our proposed MLETPE can relurn significanl 
reductions in fhe overall compufalional cosl of ETPL where fhe particle propagation 
cosl dominates. If will also relum significanl reducfions in cases where opfimal frans- 
porlalion compufalional cosl dominates, if fhe localised ETPE is used. 

This paper proceeds as follows. Seclion 2 provides a background of fhe MLMC 
mefhod, §3 describes fhe basic particle tillering framework logelher wilh fhe ETPE 
scheme. Then, fhe proposed Multilevel ETPE (MLETPE) mefhod is presented in 
§4, along wilh numerical examples lo demonslrale fhe effecliveness of fhe mefhod. 
Einally, §5 provides a summary and oullook. 

2. The Multilevel Monte Carlo Method. The Multilevel Monte Carlo estima¬ 
tor can be viewed as a variance reduction technique for a standard Monte Carlo es¬ 
timator. Suppose one wishes to compute an approximation of E[Xi], where Xi is a 
numerical approximation of a random variable X (with discretization accuracy pa¬ 
rameter' hi oc M^^). The Multilevel Monte Carlo (MLMC) method introduced in 
[9] considered the case where X is the solution to a stochastic differential equation 
(SDL) at time T > 0; the discretized solutions Xi are obtained from a given numer¬ 
ical method with stepsize hi. This paper will instead consider X to be a solution, at 
time r > 0, to a general random dynamical system, with stochastic forcing and/or 
random initial conditions (drawn from a distribution tt"). In the simplest case let X£, 

/ = !,... ,N, be X > 1 i.i.d. samples of Xi. The standard, unbiased, Monte Carlo 


'Such as the time stepsize. 
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estimator for E[Xi] is then 

(2.1) = 

(=1 


Using a telescoping sum of expectations, 

L 

(2.2) E[Xi] = E[Xo] + £E[X/] -E[X;_i], 

i=i 


one can define the MLMC approximation to E[X/,] as a sum of independent Monte 
Carlo estimators, Xi = Y!i=qXu where 


(2.3) 


leading to 


Xi 


l£l TsV. ' = 0. 

iwsjW-^r,). '>0, 


(2.4) 



Here, Ni, I = 0, ...,L, are Monte Carlo sample sizes, of i.i.d. draws from 
respectively, for each of the L + 1 estimators. We have 


(2.5) 


]E[X,] 


E[Xo], / = 0, 

E[X;]-E[Z/_i], />0, 


and hence the MLMC estimator is an unbiased approximation of E[X£]. We will call 
the estimators, X/, Z > 0, ‘multilevel difference estimators’. The important thing to 
note here is that the fine (level Z) and coarse (level Z — 1) samples in each difference 
estimator must be positively correlated for each i in each of the L + 1 multilevel 
difference estimators. This can be achieved by using the same random system input 
(e.g. initial conditions/stochastic forcing) for each Z on both levels. On the other 
hand, the samples in different difference estimators must be uncorrelated. 

For fixed T, the discretization bias (away from E[X]) of the overall estimator 
is 0{h1) [9], where a is the global discretization bias (i.e. |E[Xi] — E[X]1) of the 
numerical method used to simulate X;, Z > 0. One notes from [9] that. 


(2.6) |E[X/] -E[X,_i]| < {M-l)chf 

where c is a positive constant, and that {M — 1)^^ |E[Xl] — E[Xi_i] | can be used as an 
estimate for the overall discretization bias, \Xl — E[X]|. As each estimator in (2.4) is 
independent of one another, the overall variance is given by the sum of the variances 
of each individual estimator. Given that there is a positive correlation between X/ and 
X/_ 1 , one can expect then that the sample variance of X/ — X/_ i, denoted by 


(2.7) 


V, = V[X/ -X/_i] = V[X,] + V[X,_i] -2Cov[X/,X/_i] 
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decays at a rate proportional to I, so that V/ = 0{hf), p > 0. The covariance in the 
last term in (2.6) is taken over the joint probability distribution of Xi and X/_i. One 
can then trade off variance in fine/expensive estimators against discretization error 
in coarse/cheap estimators with lower variance by setting a decreasing sequence of 
Monte Carlo estimator sample sizes, No> N\... >Nl- The overall computational cost 
of the MLMC estimator is 

L 

( 2 . 8 ) Cml = '£’^7^N,T, 

1=0 

where, ^ defines the computational cost of propagating one single sample (of X/) 
through a discretized system. On the other hand, the cost of the standard Monte Carlo 
estimator in (2.1) is 

(2.9) Cue = hL^^NT. 

One can choose the continuous variables Ni, 1 = 0, at a rate that minimises the 
variance of the MLMC estimator for a fixed computational cost, Ni ^VihJ. In 
particular, by following a formula given in [9, 6], one can find optimal values of Ni, 
as well as the finest level L, and in doing so achieve a computational cost reduction 
relative to the standard Monte Carlo counterpart (2.1), with the same bound Mean 
Square error. Giles [9] proved the following result. 

Theorem 2.1. If the Mean Square Error ofXi is bounded by 0{£^), one can 
optimally choose L and Ni to allow the computational cost of the MLMC estimator to 
be bounded by, 


( 2 . 10 ) 


Cml < < 


cie 

C2£^^log{£f, 




- 2 - 


(r-)3) 


y<p 
Y = p 
Y>P 


where ci, C2, C3, 7, p are positive constants and a > jmin{Y,p). 

For Xf^^ to have a Mean Square Error of O(e^), a sample size N of 0{£^^) is re¬ 
quired, as well as a discretization bias given by hi = 0(e «). Thus any of the compu¬ 
tational costs in Theorem 2.1 are less than Cmc (f9(e^^^“)). The MLMC approach 
in principle is very simple to implement and can be very effective as long as one can 
satisfy the two constraints, a > ^min{Y,p) and P > 0. More detail on this method 
can be found in a generalised explanation, and related theorems, in [6]. 

3. Particle Filtering. This section will outline the standard particle filtering 
methodology. In this context, one is interested in computing statistics of a random 
process Xt, conditioned on observations of a single realisation of X, denoted X', and 
referred to as the reference solution. The observations are random variables of the 
form 


( 3 . 1 ) 


Y,,=H{X;^) + ^ 
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at times t £ {ti,... ,1^ ), where H is an observation operator, and <p is a. ran¬ 
dom variable representing measurement error. For simplicity, we choose (j) N{0,R), 

/? is a y X y covariance matrix, and we will take H to be the identity (so y=d). We 
define to be a numerical discretization of with discretization accuracy param¬ 
eter hi. Our aim here is to sequentially approximate the expectation of 

with respect to the measure r]i n > where 7]^ ^ is the posterior of X^t given the 
observations ,,,,, 4 . Let p{y\x) be the likelihood function of y given x and q{x) be the 
prior of x. Then, for any k G [l,Xv], using Importance Sampling [7], one can draw N 
i.i.d samples from the empirical approximation of the prior of X 4 4 , 

(3.2) q{XL,„) = fwL(Xi,^_J5(X4,4 

i=\ 


and denote the normalised importance weights of sample i G [ 1 ,X] to be 


(3.3) 


where 




MKfy) 

L%iMxUy 


(3.4) 




Thus, the filter weights are defined iteratively, starting from wl{X[^^) = 4. As the 
observations are given by a Gaussian distribution, the likelihood is 

(3.5) p(y„ixi,) = 

Finally, an estimator for the expectation of X 4 4 with respect to the posterior T]^ 4 is 


( 3 . 6 ) XL,t, = tMxU)Xlt,- 

i=l 

This estimator, despite being biased by 0{N^^) due to the normalised importance 
weights in a single importance sampling update, is consistent with E,^^,^[X 4 4 ]. This 
is to say, as N ^ 0 °, the estimator converges in probability to [Xu^]- 

Typically, importance weights become degenerate as k increases [4]. In this case, 
it is necessary to duplicate higher weighted particles whilst removing lower weighted 
particles; this is known as resampling. Resampling resembles an unbiased transfor¬ 
mation from the weighted ensemble, ^ ^ to an evenly weighted 

ensemble of resampled particles {X/^ The scheme outlined above is known 
as the Sequential Importance Resampling (SIR) method. For more information on 
this turn to [7]. The Ensemble Transform Particle Filter, the subject of this paper, 
uses Optimal Transportation [22] to implement this transformation, which we de¬ 
scribe next. 
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3.1. Ensemble Transform Particle Filters. ETPFs are a variant of linear en¬ 
semble transform filters (LETF) [19]. They present an alternative to the resampling 
step that takes place in the standard SIR methodology, replacing it with a linear trans¬ 
formation. The goal is to obtain a transformed set of evenly weighted particles, 
from the weighted set of particles with importance 

weights I wl {X[ ^j defining an empirical approximafion fo fhe posferior dis- 
fribufion r\L,tf This can be done wifh fhe following linear fransformafion, 

(3-7) = f fljjf/,,. 


^ori = \,...,N and j = \,...,N wifh non-zero enfries for /^-y. Here, Pij = 1. Eel 
Zi t^^ denote fhe discrete random variable wifh samples and associated probabilily 
vector WL{X[f^), i = I,...,N. Then lake to be fhe discrete random variable wifh 
samples X^^^, i = all wifh equal probabilily. For fhe ETPF, one creates a 

coupling belween and Zl^, denoted by Ihe malrix Tjj, size N x N, wilh non- 
negalive enlries. The coupling defines Ihe linear Iransformalion malrix in (3.7) as 
Pij = NTi j. This coupling can be found by solving a linear Iransporl problem by 
minimising Ihe expected Euclidean dislance belween Z^tk Zl^, subjecl to Ihe 
conslrainls 

( 3 - 8 ) = tTiJ = MxU)- 

i=i j=i 

This is in facl equivalenl to maximising Ihe covariance belween Ihe Iwo ensembles, 
since 

A,,, f ] = Ez,,, [||zl,4 f ] + Wla f ] • • ■ 

• • • - 2Ez,,,^ [zLfy] - 2Tr{Covz^^^ 2,,,^ [zLA^z^tS) ■ 

In a univariate case, we define an optimal coupling malrix, 7] y, as one which min¬ 
imises Ihe cosl function. 


(3.10) 


N N 

E E 


r=l7=l 



2 


Theoretical analysis of Ihe above Iransformalion is given in [19]. Once Ihe Irans- 
formed particles in (3.7) are found, Ihe posterior mean is now estimated by, 

<3-11) = 

If is imporlanl to note lhal Ihis linear Iransformalion, which is deterministic, will give 
Ihe same estimator as in (3.6), and Ihus does nol add considerable exlra variance to 
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the estimator from random resampling. This is a consistent estimator for the previous 
posterior mean estimator ( j, since 


(3.i2)Xi,. = Lfx4 = L££r,, 


N N 


N 


./=1 


N 




j=U=l 


j=li=l 1=1 


In the univariate case, the matrix Ti j can easily be found by an 0(^Nlog{N)) 
algorithm in [20]. This will become an important observation when discussing lo¬ 
calisation in the next section. The above constraints lead to a maximum of 2N — 1 
non-zero elements in Tij, leading to a very sparse matrix calculation, and thus the 
ensemble transformation process can be achieved in a 0{N) computational cost. The 
0[Nlog{N)) computational cost comes from the fact that one has to sort the univari¬ 
ate particles prior to the algorithm. In our numerical experiments at the end of this 
paper, this sorting was a negligible part of the ensemble transform computational cost. 
This allows one to be able to carry the ensemble transform out on every assimilation 
step without the computational expense of this being of a higher order of magnitude 
than the propagation of the particles in between assimilation steps, but more analysis 
will cover this observation in the next section. 

In the multivariate case, the same linear transport problem prevails, however one is 
required to minimise the cost function. 


(3.13) 


N N 

LLT.j 

r=l;=l 




Ljk 


-Kt. 


whose minimum defines the Wasserstein distance between Zi t^ and Zl^ ■ This can be 
solved in a 0[N^log{N)) computational cost, using algorithms such as the FastEMD 
algorithm [16]. However, this means that with many systems, this optimal trans¬ 
portation computational cost will dominate over model costs of the system and thus 
the scheme is not efficient. The model costs of the particle filter are defined fo be fhe 
compufafional cosf needed fo propagafe fhe N particles fhrough fhe sysfem in befween 
assimilation sfeps and is given in (2.9) (determined by a consfanf y). Thankfully, a 
technique called localisation can aid fhis problem, and can also provide a pivofal 
change fo fhe scheme when applying if fo high dimensional sysfems. 

3.1.1. Localisation. Localisation, a scheme frequenfly used in fhe EnKF for 
high dimensional systems, can also be applied fo fhe ETPE [5]. In fhe simples! form, 
localisafion applied fo fhe ETPE means fhaf one can reduce fhe compufafional cosf of 
designing a mulfivariafe coupling fo d fimes fhe cosf of designing a univariate cou¬ 
pling. Eocalisafion allows one fo consfrucf an individual Iransformalion in (3.7) for 
each of fhe d componenfs of a multivariate X^t^- A simple definifion for fhe local¬ 
isafion mafrix C [5] fhaf describes fhe spatial correlation sfrucfure of fhe ensemble 
could be 


(3.14) 


Cm,n — 


1 If \ / ^tn,n ^ 

“2(777:)’ (7777/ - ’ 


^loc,c 

otherwise. 
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Here m,n = are the indicies of the spatial components of 

(3.15) Sm^n = min ||m — n—N\,\m — n\,\m — n+N\^, 


and rioc.c is a constant. The above form for Cm,n explicitly takes spatial-periodicity 
into account. One can now decompose the linear transport problem in (3.13) into 
d separate linear transport problems, to find a coupling matrix i = 1,... ,N, 

j = l,...,N,for each component m = I,... ,d. The objective of these linear transport 
problems is minimising the cost function 


(3.16) 


N N 
i=ij=i 


m\ 




where 

(3.17) 




£C„,„(Xi,^(n)-X4(u)) , 

n—\ 


subject to the constraints 

( 3 . 18 ) = L L ^'./H = ^LiXU). 

i=i i=i 


Here, X[ (m) is the m’th component of X[ . Then, one can define the approximation 
of the marginal posterior mean for each m = 1,... ,N as 

(3-19) = 

where the transformed components are given by 

(3.20) x4(m) = £P,.j(m)Xi,^(m), 

and Pi,j{m) = Tij{m)N. Note the cost functions in (3.17) do not achieve the mini¬ 
mum of (3.13). When r/oc,c = 0, exhibiting the most computationally efficient sce¬ 
nario, one has the interesting case where d univariate linear transport problems need 
to be solved, thus transforming all components individually. One can simply use the 
univariate algorithm in [20], mentioned in the last section, with a computational cost 
of 0[Nlog{N)), for each linear transport problem, to get an overall 0(dNlog{N)) 
computational cost. In practice, when r/oc,c = 0, one can also reorder each of the 
transformed sets of components into the rank structure of the original ensemble. This 
preserves the copula structure [21] of the original ensemble. 

If the model costs of a system are less than that of the multivariate optimal transporta¬ 
tion, using r/oc.c = 0 is the only case in which the model costs can return to being the 
dominative cost in the ETPF estimator. Despite this, it is very reasonable to imagine 
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that this model cost will dominate that of the optimal transportation in some systems, 
especially for high dimensional Partial Differential Equations (i.e. where 7 is high). 
Localisation does have an effect on the performance of the ETPF by adding bias into 
the posterior mean approximation stemming from the fact that one is generating de¬ 
terministic couplings Tij that are minimising different (simplified) cost functions to 
the full, multivariate one in (3.13). This bias is thus caused by the decay in correla¬ 
tion between the components. Despite this, numerical experiments conducted in [5] 
find the localised ETPF to be effective even in the chaotic, highly nonlinear Lorenz 
Equations. 

Localisation is also needed in the likelihood evaluation of multivariate particles in the 
ETPF. Although this is not critical to the aim of this paper, only briefly covered here, 
it is essential for the ETPF to be able to successfully filter high dimensional systems 
due to the curse of dimensionality. Standard Sequential Monte Carlo (SMC) methods 
fail to track high dimensional systems due to exponentially degenerate importance 
weights. However, while there have only been some suggestions for a solution to 
this problem in SMCs, such as in [18], one can alter the above localisation scheme in 
the ETPF to solve this problem swiftly [5]. It is also needed to reduce the computa¬ 
tional cost of likelihood evaluations when the dimension of the state space is greater 
than the sample size {d > N). For each component {m = 1,... ,d) in the particles 
{X [generate an separate importance weight given by 


-2 ^ 



where 



(3.22) 


for n = 1 ,... A {C,n is diagonal) and the value of r/oc,R can be independent to r/o^.c- 
Of course, H should be a local operator, see [1] for details of the use of localisation 
within the EnKF. These weights are then used in the constraints in the linear transport 
problems for each individual component transformation in (3.18). The two ‘radii’ of 
localisation, rioc^c and r/oc,R will henceforth be refered to as the particular settings of 
localisation used. 

4. Multilevel Ensemble Transform Particle Filter (MLETPF), The proposed 
multilevel ETPF framework is demonstrated in this section. It creates an estimator 
consistent with the standard ETPF estimator in (3.6), for the same discretization ac¬ 
curacy level, L. The term ‘single level’ estimator will henceforth be a reference to the 
corresponding standard ETPF estimator, conditioned on the same observations, with 
the same discretization level L and variance as the proposed MLETPF estimator. The 
general premise of the MLETPF is to run L -|- 1 independent ETPF estimators, with 
Ni samples, forward in time (and space), in the coupled multilevel framework. When 
updating the weights of each particle in each estimator, the same method as the ETPF 
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holds for each of the L + 1 estimators. Thus, we define the MLETPF estimator of 
[Xi fj], as the following, where the importance weights target the 

posterior of each r/-dimensional discretization X/y^., given the observations, 

ke[l,Ny], 



)) 


(4.1) Xl,4 


We assume here that ho < At, where At = tk+i —tk for all ^ G [IjXy — 1], so that all of 
the L + 1 estimators are conditioned on the same observations. This does mean that 
one has to set a bound on the frequency of the data assimilation, given the time-step 
of the minimum level, Hq. We note that it is possible to adjust the framework here 
slightly to incorporate frequent observations only available on finer levels at certain 
times. This could be done by only updating the weights for finer ensembles af those 
observations and then proceeding with the ensemble transform stages when both the 
coarse and fine levels in each difference estimator have had an importance weight 
update. 

One notes that as the standard ETPF estimator for each level of descritization, Z > 0, 
is consistent with [X/^^], the above estimator is consistent with the [X^,^], 
given the linearity of expectation shown in (2.2). Here, each of the particles from 
the fine and coarse ensembles in each of the multilevel difference estimators are pos¬ 
itively correlated in between assimilation steps as in the standard MEMO method. 
This correlation is required for the variance of each difference estimator to decay 
with Z —)• oo as discussed in the opening section. However, now in the ETPF con¬ 
text, when one comes to transform the fine and coarse ensembles in each multilevel 
difference estimator, the two ensembles cannot be transformed independently of one 
another, and need to have a positive correlation imparted between them ready for the 
next phase of particle propagation, especially if the transformations are happening 
frequently. If the random input to the system is simply a random initial condition, 
in a system with no stochastic forcing, these particles from the fine and coarse en¬ 
sembles will certainly diverge instantly if they are not positively correlated after the 
ensemble transformations. In this paper, this positive correlation is achieved using a 
multilevel coupling step after the standard ensemble transform stage. This requires 
one to first carry out the ensemble transform (3.7) on the coarse and fine ensembles. 



respectively, to get evenly weighted particles, {X/ ^ tk]i=\ Ni Nr 

localisation is needed, one can implement this with the required parameters on both 
ensemble transforms as in the last section. It is very important that the same local¬ 
isation settings are used on all estimators, so that the overall MEETPF estimator is 
consistent with the single level ETPF estimator with the same localisation settings. 
The key point of this being that the fine and coarse ensembles from each discretiza¬ 
tion level will have the same systematic localisation bias as one another. This means, 
such as with the discretization bias, that the localisation biases can cancel each other 
out in the telescoping sum of estimators (4.1), leaving only the systematic localisa- 
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lion bias of the finest level equal to that of the localised single level estimator. At this 
point, one notes that (4.1) becomes 


(4.2) 


= 


1 ^0 


+ I 

/=! 


I Ni 


Now one needs to positively couple the fine and coarse ensembles of transformed 
particles from each estimator above. We propose to build another coupling between 
Xin and denoted by , that minimises the cost function 


(4.3) 


with constraints 


Ni Ni 

EEL. 

r=l;=l 


F/C 


1 /, -l/_, , 


2 


(4.4) 


V = — V = — 

hh 


This is an assignment problem and in the multivariate case it can be solved by the 
Hungarian algorithm [15] with a computational cost equal to the multivariate linear 
transport problem algorithms discussed previously and so is the same order of mag¬ 
nitude as the corresponding ensemble transform stage in the standard ETPF method. 
In the univariate case, one can simply use the cheap algorithm in [20], exactly like 
the ensemble transform stage. One notes that the above assignment problem returns a 
coupling with one element in each row and column (^), resulting in particles simply 
being reordered and not transformed. This therefore returns exactly the same trans¬ 
formed particles in each ensemble. The reordering can be seen as finding the trans- 
formation matrix P. ■ = T- ■ Ni and then applying the standard ensemble transform, 

in (3.7), to both the fine and coarse transformed ensembles to get new ensembles of 
1^/4 }i=i Ni {^/-i Ni which are now positively correlated. Each multi¬ 

level difference estimator can now be estimated by 

1 N, 


Using a calculation similar to (3.12), we now show that the estimator in (4.5) is 
consistent with the term 




(4.6) 
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from Equation (4.1). Let Tfj and Tfj {i,j = I,... ,N) be the coupling matrices used 
for the ensemble transform on the finer and coarse ensembles respectively, then 


(4.7) 


1 Ni N, Ni 

7=1 i=i 


Ni Ni 


Ni 


= vE -J-fAt,,,.) = E {»,(x 4 )x 4 

The estimator in (4.2) can therefore be written as 


1 


No 


L . I N, 


a '.A'- ft 


(4.8) Xt,. = (j^E^L)+E(^L (xi.. -X/-.,J). 

with covariance 

(4.9) 
where 

(4.10) V/ = 


CovlX^J = t^, 

1=0 


fcov[:?o,4]) ^ = 0) 

Z>0, 


due to the fact that each estimator in (4.8) is independent. The above estimator is con¬ 
sistent with the single level (localised, with the same settings as used in the MLETPF) 
ETPF estimator due to (4.7) and (4.1). Therefore, in the absence of localisation, it is 
also consistent with [TQ.,?*]- 

The multilevel coupling minimises the expected distance between the two 
transformed ensembles, which maximises the covariance between them via (3.9) and 
then finally minimises V/. The multilevel coupling procedure above minimises V/ 
in each multilevel difference esfimafor; we also ensure fhaf pairs of particles in fhe 
coarse and fine ensembles are posifively coupled in befween assimilation steps (by 
using fhe same random inpuf). The aim of fhis is fo make fhe covariance V/ de¬ 
crease af an asymptotic rafe 0{hf) (j8 > 0) required for fhe variance reducfion of 
fhe mulfilevel framework fo work. This is because fhe fine and coarse ensembles 
are coupled bofh in befween assimilation sfeps and during fhem via fhe coupling. 
This will be demonsfrafed in numerical experimenfs later in fhe paper. Designing 
fhis coupling befween fhe bofh fransformed ensembles and Xi t^ is fhe key to 

fhe proposed MLETPF mefhod, and enforces correlafion amongsf bofh fransformed 
ensembles whilsf remaining consisfenf wifh fhe single level ETPF esfimafor. Mosf 
importanfly if also suifs fhe ETPF mefhod since fhe coupling can be generafed simply 
and cheaply when using fhe r/oc,c = 0 localisation fhaf can be used freely in fhe ETPF. 
This will be shown later. 
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4.1. Algorithm. In this section, we present an algorithm to implement the MLETPF 
in practice. In this paper, a pre-defined recurrence relation for the decay of A/ as Z —)• oo 
will be set (A/+i = f{Ni)) and these sample sizes will be kept fixed throughout the 
filtering process. The finest discretization level L > 0 will be kept arbitrary for now. 
The algorithm is now presented. 

1. Start at to (thus k = 0) and with 1 = 0. Choose Nq. 

2. Calculate Ni = if Z > 0. Sample ~ ^ ~ ^ 

{^Oo’^/-Lfo};=L...A, ~ where A/,^ =^/-i/o if ^ > 0> attimeto- 

3. Propagate all samples forward according to system dynamics until time 

If Z > 0, the fine and coarse pairs of samples in each esfimafor musf be coupled by 
using fhe same random inpuf. 

4. Derive fhe normalised imporfance weighfs for : 


(4.11) 




tk+l 




5. Transform fhe ensembles, and 

info fhe evenly weighted ensembles, and {^/_i using fhe 

linear fransformafion in (3.7). Couple fhem using fhe mulfilevel coupling mafrix 

fo produce reordered ensembles and {^/_i tk]i=\ n,' 

6 . Move on fo Step 7 if Z = L or if nof, iferafe Z+ = 1 fhen repeal sleps 2-6 for 
k = 0 or steps 3-6 for k > 0. 

7. Iferafe k-i-=l. Slarl again from step 3 wilh 1 = 0. The MLETPF approxima- 
lion of [Xl^,^] is given by (4.8). 

4.2. Computational Cost of the MLETPF and ETPF. As previously noted, in 
a multidimensional case, it is computationally expensive to generate the coupling ma¬ 
trices needed to couple the fine and coarse transformed particles. Localisation is used 
to reduce the computational cost of the ensemble transforms down to 0(dNlog{N)) 

when rioc,c = 0 in the standard and multilevel ETPF methods, and this too can also 

F /C 

reduce the computational cost of generating the multilevel coupling T^ ■ . When lo¬ 
calisation is used along with rioc,c = 0> one can break the multivariate coupling 
down into d separate univariate couplings. As the components are transformed indi¬ 
vidually, one can simply find a coupling for each individual component m 

in the transformed coarse and fine ensembles with the cost function. 


(4.12) 


Ni N, 
i=].j=l 


and the same constraints as in (4.4). Each of these couplings can again be found us¬ 
ing the cheap, 0{Nilog{Ni)') (for each Z) univariate algorithm in [20]. One can then 
reorder / transform each component of fine and coarse ensembles separately using the 
same methodology as in the last section. This performs a similar role as resampling 
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Ni particles from and F^J,{u) where the marginal (empiri¬ 
cal) inverse cumulative distribution functions of Xify{m) and respectively. 

Here the same uniform variate u G [0, 1] is used for each pair of the Ni samples. If 
localisation is carried out along with rioc,c > 0, the computational cost of the optimal 
transport in the standard ETPF and thus the multilevel coupling, minimizing the full 
multivariate cost function in (4.3), in the MLETPF will rise to 0[dNflog{Ni)). This 
is because d different localised, but still multivariate, optimal transportation problems 
will have to be solved. Therefore, in this scenario, the model costs are likely to be 
dwarfed by these optimal transportation costs, which are fixed by definition. In this 
case there is no justification for implementing the multilevel framework as it only 
aims to reduce model cost and not the optimal transportation computational expense. 
However to consider the case where model cost does dominate that of the multivari¬ 
ate optimal transportation, the full multivariate coupling will be demonstrated in the 
numerical exeriments at the conclusion of this paper. 

This paper now considers the overall computational cost of the ETPF and MEETPF 
estimators when localised, with r/oc.c = 0. In this case, as explained above, one 
can reduce the computational expense of not only the ensemble transform stage, 
but the multilevel coupling stage as well, in the MEETPF scheme from a poten¬ 
tial 0[Nflog{Ni)) to 0(dNilog{Ni)) for each multilevel difference estimator. This 
is enough, with suitable assumptions, to expect that the model computational cost 
bounds for the standard MEMC method in Theorem 2.1 are of the same order of 
magnitude to that of the entire MEETPF, including the ensemble transform and cou¬ 
pling stages, as one can simply ‘hide’ the optimal transportation costs behind the 
particle propagation costs. This follows from the proposition. 

Proposition 4.1. If At > is constant, and one can bound the computational 
cost of all Ny ensemble transform / multilevel coupling stages (with the last term 
being the cost associated to the sort prior to the algorithm) of the MLETPF by, 

(4.13) Get < X (dciN,Ny + deyNilog{Ni)Ny) 

1=0 ^ 

where c\ is a positive constant, then the total computational cost of the MLETPF is 
bounded by, 

(4.14) Cmletpf < ic 2 NiCidtEi^+ de\Nilog{Ni)N^ 

where Cij is the cost of propagation of one particle on level I (dependent on d), ei is 
a positive constant and C2 is a positive constant. 

Proof Eet the total computational cost of the MEETPF be given by the sum of 
the model cost and the ensemble transform cost (including the localised likelihood 
evaluation in (3.21), which scales at 0{Ni) due to the sparse diagonal dxd matrix C, 
with rioc,R assumed constant (riocp = 0 ( 1 )) and << d) 


( 4 . 15 ) 


Cmletpf — Get +Cmodel 
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then bounding Cet as in the elaim and using the model cost in (2.8), 

Cmletpf < ^ {^lidciNy + Ci^dC^tNy) + de\Nilog{Ni)N^ 

(4.16) < (Ni{dc4tNy + Ci^dCstNy) + deiNilog{Ni)Ny'\ 

1=0 ^ 

< 52 i<^2NiCi^dtNy+ de\Nilog{Ni)N^ 

Here we assume that Cu will grow at least linearly with d, C4 = ^ and that C3 is a 
positive constant. □ 

The last term of each of the expressions above comes from the sorting of the uni¬ 
variate particles in the cheap algorithm. Although this computational cost scales at 
0{Nilog{Ni)) , the constant e\ is typically very small with respect to the other con¬ 
stants in the cost expression; furthermore this scaling is a worst case estimate, in 
many cases this would not be reached. Therefore assuming that this part is less than 
the remainder of the localised ensemble transform cost, one can bound the overall 
cost of the MLETPF by the model cost, and keep the corresponding reductions out¬ 
lined in Theorem 2.1. However, even if this sorting cost does dominate the ensemble 
transform cost, one still expects to recover computational cost reductions relative to 
the single level estimator, for a fixed error. Despite the model cost in this case not 
neccessarily being the highest computational expense in the localised MLETPF, with 
the cost of the sorting / ensemble transform dominating, it is important to note that 
for a fixed error bound, fhis sorfing cosf / ensemble Iransform will sfill fypically be 
less fhan fhe model compufafional cosf of fhe single level ETPF. The reducfions in 
Theorem 2.1 would fhen be a slighf underesfimafion however fhere would sfill be evi- 
denf reductions of compufafional cosf relafive fo fhe single level ETPF. This is indeed 
fhe case for fhe numerical examples in fhe nexf secfion, as we show fhere. 

4.3. Numerical Examples. Numerical examples of fhe MLETPF mefhod, ap¬ 
plied fo classical dafa assimilation problems, are given in fhis secfion. Three prob¬ 
lems will be sfudied: fhe mulfivariafe, chaotic Sfochasfic Lorenz-63 Equafions, fhe 
univariafe, buf nonlinear double-well OU Process and fhe high dimensional Sfochas- 
fic Lorenz-96 Equafions. The algorifhm above will be used fo generafe experimenfal 
MLETPF esfimafors (fhaf are compared againsf fhe single level ETPF esfimafors) for 
fhe latter two of the three problems above, along with varying levels of pre-defined of 
accuracy. This pre-defined level of accuracy, 0{e), will defermine L (fhe finesl level 
/ overall numerical discretization bias), fhe fixed sample sizes for each esfimafor in 
fhe MLETPF mefhod (A/) and fhe fixed overall sample size in fhe corresponding sin¬ 
gle level ETPF esfimafor required fo achieve fhe order of magnifude of fhis error in 
bofh esfimafors. This follows from fhe sfandard Monte Carlo approximation error 
decomposifion given by fhe Cenfral Limif Theorem, as in [9]. One can fhen compare 
fhe compufafional cosf, given as fhe number of operations, for bofh fhe single level 
and multilevel estimators, which should be in line wifh Theorem 2.1 given fhe same 
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pre-defined error. The error is not bounded exactly due to variations in the variance 
at each assimilation step, but a proof of concept from a practioner’s viewpoint can be 
established. 

The error in the estimators will be estimated by the time-averaged root mean square 
error (RMSE), given by 


(4.17) 


RMSE = 



2 


where is the posterior distribution of given the observations An ap¬ 

proximation of [Xf^] will be used in the RMSE calculations above, by computing 
a standard ETPE estimator for it, of which the numerical discretization bias and sam¬ 
ple size produce an estimator with an error orders less than any e used in the following 
experiments. Where localisation is used in the single level and multilevel estimators 
for the problems above, the estimators are inconsistent with but crucially 

consistent with each other as the same localisation settings are used. The ETPE ap¬ 
proximation of will use the same localisation settings to correctly compare 

the ETPE and MEETPE estimators like for like. 

The recurrence relation for A/, I > 1, given Nq (dependent on e), used in both 
numerical experiments will be set to A/+i = , however the optimality of 

this depends on the relative value of p with respect to 7 [9]. This is only optimal 
for j 8 > 7 , (j 8 -|- 7 ) = 3; this is the case in the numerical examples below. One notes 
that for a RMSE of 0{e), one requires that A/ = and that A = 0{e^^) for 

the single level ETPE. Also, the discretization bias of the multilevel and single level 
ETPE estimators, from E^,^ should be 0{e). Thus, for a numerical scheme that 
has a global discretization bias of one requires 


(4.18) 


log{{t^d)/e) 

alog{M) 


for the sum of the multilevel and single level ETPE estimator components bias to be 
of 0{e). One could also use the maximum among all components’ biases to be a 
suitable measure here. The sample covariances of the independent ETPE estimators 
will also be measured by a sum among all components: Tr{Yi), the trace of the co- 
variance matrix. 

Eor the single level ETPE, using the above analysis, one can see how requiring 
hi = 0{e) and A = the model cost in Equation (2.9) will be dom¬ 

inating the localised (with rioc^c = 0) ensemble transform cost 0(dNlog{N)), which 
is equivalent to 0(de^^log{e^^))', this supports the point made in the last section. 
This is also greater than the localised ensemble transform cost of the MEETPE, 
0(dNilog{Ni)), again equivalent to 0(de^'^log{e^^)). Thus the computational cost 
reductions of the MEETPE, even if the sorting cost of the localised ensemble trans¬ 
form dominates, is apparent in these cases. The computational cost for the these 
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numerical examples is defined as the theoretical number of operations needed to com¬ 
pute each approximation, including the ensemble transform stage and the multilevel 
coupling for the MLETPF. This is computed simply by inserting a step-counter into 
the numerical implementation (functions etc.) in the Python code used for these ex¬ 
periments. Finally, M = 2 is used for each numerical example. 


4.3.1. Stochastic Lorenz-63 Equations. This simple 3-component chaotic non¬ 
linear system inX = {x,y,z). 


(4.19) 


dX 

dt 


< x{p-z)-y + ^^, 

U-i8z + 0f, 


with p = 28, a = 10, j8 = 8/3, 0 = 0.4 will be used to demonstrate the effect of 

F /C 

the multivariate multilevel coupling, T^ ■ , without localisation and thus with cost 
function in (4.3). The localised coupling will also be used as a comparison. Here, 
the Brownian motion W will be the same for each component to keep the strong 
nonlinearity in the equations. Computational cost against accuracy comparisons with 
the standard ETPF method will not be investigated here given the low model cost 
of this test problem and thus the dominating effects of the multilevel coupling and / 
or the ensemble transform stage in both the ETPF and the MEETPF will make the 
model cost reductions of the multilevel framework unnoticeable. The MEETPF es¬ 
timator with Nq = 500, and L = 5, using the full multivariate cost function in (4.3) 
to find the multilevel couplings T-■ in the aforementioned algorithm, is used to 
compute an approximation to with X as above, for k G [1,5120], with 

/lo = = At, and thus = 40. Here, X^ is the solution to the above Eorenz-63 

equations using the forward Euler numerical scheme. The reason that we choose the 
minimum level to be equivalent to Z = 7 is for stability when using the Euler method. 
Using different numerical methods, with greater stability at greater time-steps, and 
thus lower levels would be able to decrease this minimum level. The observations are 
given by a measurement error with R = 21, where 1 is the 3x3 identity matrix and 
weights are thus based on observations of all components x, y and z. Figure 1 shows 
the mean estimates of Tr{Yi) (Z G [1,5]) over all assimilation steps k G [l,A^y]- The 
asymptotic decay of the above estimates show importantly that the multilevel cou¬ 
pling, with the multivariate cost function, successfully produces the variance decay, 
Tr{Yi) = 0{hf), in this case j3 « 1. The figure also shows the value of j3 (variance 
decay) for the case where r/oc,c = 0 localisation for the coupling. Eocalisation being 
used in a problem such as the strongly nonlinear Eorenz 63 system is dangerous due 
to the decay in correlations between components, but with the parameters above, it is 
used simply to compare the rates of variance decay with the non-localised case. One 
recovers j3 « 2 in the localised case; this showing that refining the optimal transport 
down to the one dimensional localisation case is beneficial for variance reduction but 
comes with the sacrifice of inconsistency of the reference “single level” estimator . 



18 


A. GREGORY, C. J. COTTER AND S. REICH 



Fig. 1: Mean, over all assimilation steps, k G [l,A^y], of the estimates ofTriyi), with 
I G [1,5] for the Stochastic Lorenz-63 Equations. Both the non-localised and the 
f'ioc,c = 0 localised cases are shown. 


4.3.2. Double-well OU Process. This is a univariate test problem that demon¬ 
strates the eost effeetive, consistent, MLETPF estimator of where is 

a numerical discretized solution to the double-well, nonlinear OU process, 

(4.20) dXt, = -V' (Xy, )dt + ^ dWt, 

k G [l,Xv] and IFy^, is a standard Brownian Motion. Here, V{Xtf) = |Xy^ — ^Xy^. This 
example uses hi = 2^^^^ but this is arbitrarily chosen. The stochastic forcing is set 
to (§ = 0.5. The observations and assimilation times were given by /? = 0.6, t^y = 50 
where At = ho and so Ny = 800. The numerical discretizations of Xy^ are computed 
by the Euler-Maruyama numerical scheme. The parameters above produce a stable 
numerical solution for a single realisation of the above system when using this scheme 
for the time frame above. A very accurate simulation (Nq = 10000, L = 7) of the 
MEETPE estimator is run to demonstrate the mean asymptotic decay of V/ and |X/ y^ | 
(/ G [1,7]) over all assimilation steps. These are shown in Eigure 2. The values of 
a « 1, j3 PS 2, are as expected given the Euler-Maruyama global discretization bias 
of 0{hi) and the additive noise in the OU Process contributing to the variance. 

Eigure 3 shows the computational cost against the accuracy (RMSE) for the 
MEETPE and the single level ETPE estimators over varying values of e. Here one 
sets Nq = for the MEETPE and N = for the single level ETPE estimator. 
One can clearly see the expected orders of growth for the computational cost of the 
standard ETPE (C?(e^^), as 7 = 1 and a = 1) and the MEETPE (C?(e^^), given that 
7 < j 8 ) that were shown in Theorem 2.1 for the pre-defined RMSE of 0{e). 
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Fig. 2: Mean, over all assimilation steps, k G of the estimates ofYi (variance) 

and (expectation) with I G [1,1] for the double-well OUprocess. 


4.3.3. The Stochastic Lorenz-96 Equations. The final numerical test for the 
MLETPF method in this paper is the high dimensional (d = 40 in this case) Stochastic 
Lorenz-96 Equations, given by, 

, 4 . 21 ) 

^ ' dt 3AX dt ’ 

with j G [1,A^J> fVx = 40, AfrAv = 10 and thus AX = 0.25; dWj/dt are 40 i.i.d. Brow¬ 
nian motions. Here, F is a constant forcing (F = 8) and = 0.4. Periodic boundary 
conditions are used, so that X (—1) = X (Nx) ■ The observations were given by a mea¬ 
surement error of /? = 61, where I is the 40 x 40 identity matrix and assimilation 
times were set to t^^ = 100 where At = ho meaning Ny = 1600 as h; = 2^^^' (again 
simply arbitrary). The Euler-Maruyama method is used to find Xi^ once again here. 
In this numerical example, the ETPE and MEETPE estimators use rioc,R = 1- The 
localisation setting of rioc,c = 0 is used for both the multilevel and single level ETPE 
estimators here, due to the model cost, Cij, being simply equal to h^^d, and thus 
much lower than that of high optimal transportation costs in multiple dimensions. 
Once again, a very accurate simulation (Nq = 1000, L = 10) of the MEETPE was 
generated to demonstrate the mean asymptotic decays of |X;f^(/)| and TrfVi) 
(I G [1,10]) over all assimilation steps and these are shown in Eigure 4. These follow 
the same, expected, values of a ss 1 , jS « 2 , as with the last example. 

Next, the stability of the MEETPE is considered. Since N; is fixed and does not 
change to bound error over time, we can look at the errors from the reference solu¬ 
tion, Xj'^, compared to the observational errors, to study the stability of the MEETPE 
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Fig. 3: Computational Cost (number of operations) against the time-averaged RMSE 
of the ETPF and MLETPE estimators for the double-well OU process. Reference 
lines show the orders of decay of RMSE^^ and RMSE^^. 


estimator over time and check that the errors do not increase. One expects that, for a 
successful particle filter, the errors from the estimator should be less than the obser¬ 
vational errors and remain stable. Figure 5 shows this expected behaviour, where the 
cumulative time-averaged RMSE values from of both the observations and the 
MLETPF estimator (using arbitrary values of Nq = 1000 and L = 10) are shown. Eor 
k G [l,Xj.], these are defined fo be. 


(4.22) 

for fhe observafions and. 




2 


1=1 


(4.23) 


1=1 


for fhe MEETPE esfimafor. To demonsfrafe fhe sfabilify of fhe variance of fhe particle 
filler, cumulative fime-averaged RMSE values for fhe second momenfs of Xi^ tk Yt^ 
are shown in Eigure 6. Einally, fo compare fhe MEETPE wifh if’s sfandard counlerparf 
for fhis sel of equations, Eigure 7 shows fhe compulalional cosl againsf fhe accuracy 
(RMSE) for fhe sfandard ETPE and fhe MEETPE eslimalors over varying values of 
£. Once again, one sefs Nq = for fhe MEETPE and N = for fhe single 
level ETPE esfimafor. This follows fhe successful cosl reduclions achieved in fhe Iasi 
example, defined in Theorem 2.1. 
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Fig. 4: Mean, over all assimilation steps, k G [l,A^y]. of the estimates ofTr(Yi) (vari¬ 
ance) and Lf=i|^/A(0| (expectation) with I G [1,10] for the Stochastic Lorenz-96 
Equations. 


5. Summary and outlook. This paper has demonstrated a proof of concept for 
the application of MLMC to nonlinear filtering. The Ensemble Transform Particle 
Filter (ETPF), coupled with localisation, allows one to simply and cheaply carry out 
a multilevel coupling between each fine and coarse ensemble in each independenf 
Monfe Carlo esfimafor in fhe MEMC framework. A recenf sfudy has also proposed a 
framework fo apply MEMC fo nonlinear filfering wifh a modified random resampling 
step in fhe sfandard parficle filfering mefhodology fo couple particles from coarse and 
fine levels [12]. In confrasf, fhe coupling in fhe presenf paper is designed fo minimise 
fhe Wassersfein disfance befween fhe disfribufions of fhese Iransformed ensembles 
(in fhe sfandard ETPF mefhodology), originally suggesfed in [8]. If has been shown 
fhrough numerical experimenfs fhaf one can restore positive correlation befween fine 
and coarse ensembles which mighf have been losf if fhey had been Iransformed inde- 
pendenfly of one anofher. This in furn safisfies fhe neccessary consfrainfs on fhe 
sample variance of each independenf mulfilevel esfimafor, allowing fhe proposed 
MEETPF mefhod fo reduce fhe compufafional cosl of fhe propagation of particles 
in fhe localised ETPF mefhod. 

In general, localisation wifh rioc,c = 0 makes fhe compufafional cosl of Ihis cou¬ 
pling and fhe ensemble Iransform in fhe MEETPF cheap enough fhaf fhe mulli- 
level framework can relurn overall compufafional cosl reduclions from fhe sfandard 
ETPF mefhods; fhe aim of fhe paper. If musl be noted fhaf alfhough fhis paper has 
only louched on fhe case where fhe very crude rioc,c = 0 localisalion is considered. 
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Fig. 5: The cumulative time-averaged RMSE of the observations and MLETPF esti¬ 
mator away from the reference solution, Xl^,for the Stochastic Lorenz-96 Equations. 


due to small model cost test problems, this method could also be applied to high¬ 
dimensional systems where the model cost dominates that of the multivariate optimal 
transportation. One can do this without any crude constraints on r/oc,c using the full 
multivariate coupling methodology presented in this paper and demonstrated numer¬ 
ically. However whether the variance decay of V/ from such a multivariate coupling 
would hold, producing as strong results, in the limit of d » N unknown and thus 
the issue of how one would adjust this coupling to be used alongside other values 
of rioc,c to reduce the dimensionality of these multivariate couplings remains to be 
explored. 

Iterative and approximate schemes for solving discrete optimal transportation prob¬ 
lems have been an area of rapid research in the last few years [17] and this offers 
the chance to improve the multilevel coupling in the proposed method by reducing 
computational cost. This could be done by trading-off between the optimality and 
computational cost of the coupling for each Z, e.g. more expensive / optimal cou¬ 
plings for greater I with lower sample sizes Ni. 

The form of the coupling used in this paper is simple to implement and has the poten¬ 
tial to be used in plenty of applications, in and outside of data assimilation, whenever 
one wishes to establish consistent correlation between two distributions for variance 
reduction. Considering an extension for the multidimensional example presented in 
this paper, one could also apply a spatial multilevel framework, setting the spatial 
resolution (AX/) to be dependent on the level of discretization, as done in [6, 2] to 
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Fig. 6: The cumulative time-averaged RMSE of the second moment of the observa¬ 
tions and MLETPF estimator away from the second moment of the reference solution, 
{XlJ'^yfor the Stochastic Lorenz-96 Equations. 


gain even more significant cost reductions. 
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Fig. 7: Computational Cost (number of operations) against the time-averaged RMSE 
of the ETPF and MLETPE estimators for the Stochastic Lorenz-96 Equations. Refer¬ 
ence lines show the orders of decay of RMSE^^ and RMSE^^. 
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